A Fuzzy Hashing Approach Based on Random Sequences and Hamming Distance

Authors

  • Frank Breitinger
  • Harald Baier
Abstract

Hash functions are well-known methods in computer science for mapping arbitrarily large inputs to bit strings of a fixed length that serve as unique input identifiers (fingerprints). A key property of cryptographic hash functions is that even if only one bit of the input is changed, the output behaves pseudo-randomly, and therefore similar files cannot be identified. However, in computer forensics it is also necessary to find similar files (e.g. different versions of a file), which requires a similarity-preserving hash function, also called a fuzzy hash function. In this paper we present a new approach to fuzzy hashing called bbHash. It is based on the idea of ‘rebuilding’ an input as well as possible using a fixed set of randomly chosen byte sequences, called building blocks, of byte length l (e.g. l = 128). The procedure is as follows: slide through the input byte by byte, read out the current input byte sequence of length l, and compute the Hamming distances of all building blocks against the current input byte sequence. Each building block with a Hamming distance smaller than a certain threshold contributes to the file’s bbHash. We discuss the advantages and disadvantages of bbHash compared with other fuzzy hashing approaches. A key property of bbHash is that it is the first fuzzy hashing approach based on a comparison to external data structures.
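The sliding-window procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the seeded random generator, the small block length used in testing, and the choice to report matches as (offset, block index) pairs are all assumptions, since the abstract does not say how matching building blocks are encoded into the final bbHash.

```python
import random
from typing import List, Tuple


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the differing bits between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def make_building_blocks(count: int, length: int, seed: int = 0) -> List[bytes]:
    """Generate a fixed set of random byte sequences (hypothetical helper).

    A fixed seed keeps the set identical across runs, so that hashes of
    different files are computed against the same building blocks.
    """
    rng = random.Random(seed)
    return [bytes(rng.randrange(256) for _ in range(length)) for _ in range(count)]


def matching_blocks(
    data: bytes, blocks: List[bytes], threshold: int
) -> List[Tuple[int, int]]:
    """Slide through `data` byte by byte and compare each window of length l
    against every building block.

    Returns (offset, block_index) for each block whose Hamming distance to
    the current window is smaller than `threshold`. How these matches are
    turned into a digest is an assumption left open here.
    """
    if not blocks:
        return []
    length = len(blocks[0])
    matches = []
    for offset in range(len(data) - length + 1):
        window = data[offset:offset + length]
        for index, block in enumerate(blocks):
            if hamming_distance(window, block) < threshold:
                matches.append((offset, index))
    return matches
```

The sketch compares every window against every building block, so its cost grows with input length times block count times l. The abstract's example of l = 128 bytes would work the same way, with a threshold chosen accordingly.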

Similar resources

Compressed Image Hashing using Minimum Magnitude CSLBP

Image hashing tolerates compression, enhancement, and other signal processing operations on digital images, which are usually acceptable manipulations. In contrast, cryptographic hash functions are very sensitive to even single-bit changes in an image. Image hashing is a sum of important quality features in quantized form. In this paper, we propose a novel image hashing algorithm for authentication which i...

Random projections weakly preserving the Hamming distance between words

Random projections in Euclidean space reduce the dimensionality of the data while approximately preserving the distances between points. In the hypercube a weaker property holds: random projections approximately preserve the distances within a certain range. In this note, we show an analogous result for the metric space ⟨Σ^d, d_H⟩, where Σ^d is the set of words of length d on alphabet Σ and d_H i...

Query-adaptive Image Retrieval by Deep Weighted Hashing

Hashing methods have attracted much attention for large-scale image retrieval. Some deep hashing methods have recently achieved promising results by taking advantage of the better representation power of deep networks. However, existing deep hashing methods treat all hash bits equally. On one hand, a large number of images share the same distance to a query image because of the discrete Ham...

Solving New Product Selection Problem by a New Hierarchical Group Decision-making Approach with Hesitant Fuzzy Setting

Selecting the most suitable alternative under uncertainty is considered as a critical decision-making problem that affects the success of organizations. In the selection process, there are a number of assessment criteria, considered by a group of decision makers, which often could be established in a multi-level hierarchy structure. The aim of this paper is to introduce a new hierarchical multi...

Inverse Maximum Dynamic Flow Problem under the Sum-Type Weighted Hamming Distance

The inverse maximum dynamic flow (IMDF) problem is among the most important problems in the field of dynamic network flow, and previous research has considered it under Euclidean norm measures. However, recent studies have mainly focused on inverse problems under the Hamming distance measure due to their practical and important applications. In this paper, we study a general approach for handling the inv...

Publication date: 2012